Improved monaural speech segregation based on computational auditory scene analysis
نویسندگان
چکیده
A lot of effort has been made in Computational Auditory Scene Analysis (CASA) to segregate target speech from monaural mixtures. Based on the principle of CASA, this article proposes an improved algorithm for monaural speech segregation. To extract the energy feature more accurately, the proposed algorithm improves the threshold selection for response energy in initial segmentation stage. Since the resulting mask map often contains broken auditory element groups after grouping stage, a smoothing stage is proposed based on morphological image processing. Through the combination of erosion and dilation operations, we suppress the intrusions by removing the unwanted particles and enhance the segregated speech by complementing the broken auditory elements. Systematic evaluation shows that the proposed segregation algorithm improves the output signal-to-noise ratio by an average of 8.55 dB and cuts the percentage of noise residue by an average of 25.36% compared with the mixture, yielding a significant improvement for speech segregation.
منابع مشابه
On Amplitude Modulation for Monaural Speech Segregation
We propose a computational auditory scene analysis (CASA) model for monaural speech segregation. It deals with low-frequency and high-frequency signals differently. For high-frequency signals, it generates segments based on common amplitude modulation (AM) and groups them according to AM repetition rates. This model performs substantially better than previous CASA systems.
متن کاملAn Auditory Scene Analysis Approach to Monaural Speech Segregation
A human listener has the remarkable ability to segregate an acoustic mixture and attend to a target sound. This perceptual process is called auditory scene analysis (ASA). Moreover, the listener can accomplish much of auditory scene analysis with only one ear. Research in ASA has inspired many studies in computational auditory scene analysis (CASA) for sound segregation. In this chapter we intr...
متن کاملMonaural segregation of voiced speech using discriminative random fields
Techniques for separating speech from background noise and other sources of interference have important applications for robust speech recognition and speech enhancement. Many traditional computational auditory scene analysis (CASA) based approaches decompose the input mixture into a time-frequency (T-F) representation, and attempt to identify the T-F units where the target energy dominates tha...
متن کاملIntegrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments
The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. ...
متن کاملAnalysis and Synthesis of Sinusoidal Noise in Monaural Speech Using CASA
CASA is the technique used to segregate a target speech from a monaural mixture. This article proposes a technique to separate the sinusoidal noise from monaural mixtures. Many sounds are there that are important to humans are having pseudo-periodic structure over a particular period /stretch of time. Where this fixed period is typically range of 100Hz-5KHz which gives the corresponding pitch p...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- EURASIP J. Audio, Speech and Music Processing
دوره 2013 شماره
صفحات -
تاریخ انتشار 2013